Abstract: We analyze a novel architecture for caching popular 
video content to enable wireless device-to-device collaboration. 
We focus on the asymptotic scaling characteristics and show 
how they depend on video content popularity statistics. We 
identify a fundamental conflict between collaboration distance 
and interference and show how to optimize the transmission 
power to maximize frequency reuse. 

Our main result is a closed form expression of the optimal 
collaboration distance as a function of the model parameters. 
Under the common assumption of a Zipf distribution for content 
reuse, we show that if the Zipf exponent is greater than 1, it is 
possible to have a number of D2D interference-free collaboration 
pairs that scales linearly in the number of nodes. If the Zipf 
exponent is smaller than 1, we identify the best possible scaling 
in the number of D2D collaborating links. Surprisingly, a very 
simple distributed caching policy achieves the optimal scaling 
behavior and therefore there is no need to centrally coordinate 
what each node is caching. 

I. Introduction 

Wireless mobile data traffic is expected to increase by 
a factor of 40 over the next five years, from the current 
93 Petabytes to 3600 Petabytes per month [1]. This explosive 
demand is fueled mainly by mobile video traffic, which is 
expected to increase by a factor of 65 and become by far the 
dominant source of data traffic. Since the available spectrum 
is physically limited and the spectral efficiency of current 
systems is already close to optimum, the main method for 
meeting this increased demand is to bring content closer to 
the users. Femto base stations [2] are currently receiving a 
lot of attention for this purpose. 

A significant bottleneck in such small-cell architectures is 
that each station requires a high-rate backhaul link. Helper 
stations that replace high-rate backhaul with storage [3], [4] 
can ameliorate the problem, but still require additional 
infrastructure and have limited flexibility. 

To circumvent these problems, we recently proposed the 
use of device-to-device (D2D) communications combined with 
video caching in mobile devices [5], [7]. The approach is based 
on three key observations: (i) Modern smartphones and tablets 
have significant storage capacity, (ii) video has a large amount 
of content reuse, i.e., a small number of video files accounts 
for a large fraction of the traffic, (iii) D2D communication can 
occur over very short distances thus allowing high frequency 
reuse. Our proposed architecture functions as follows: users 
can collaborate by caching popular content and utilizing local 
D2D communication when a user in the vicinity requests a 
popular file. The base station can keep track of the availability 
of the cached content and direct requests to the most suitable 
nearby device; if there is no suitable nearby device, the BS 
supplies the requested video file directly, via a traditional 
downlink transmission. Storage allows users to collaborate 
even when they do not request the same content at the 
same time. This is a new dimension in wireless collaboration 
architectures beyond relaying and cooperative communications 
as in [6], [5] and references therein. 

A D2D video network can be analyzed using a protocol 
model, which means that only two devices that are within 
a "collaboration distance" of each other can exchange video 
files, while devices with a larger distance do not create any 
useful signal, but also no interference, for each other. The 
choice of the collaboration distance represents a tradeoff be- 
tween two counteracting effects: decreasing the collaboration 
distance increases the frequency reuse and thus the potential 
throughput, but on the other hand decreases the probability that 
a device can find a requested file cached on another device 
within the collaboration distance. In J?) we described this 
tradeoff and provided numerical solutions for the optimum 
distance, and the resulting system throughput. 

In the current paper we concentrate on the analytical 
treatment of the scaling behavior of a D2D network, i.e., 
how the throughput scales as the number of nodes increases. 
For conventional ad-hoc networks, scaling behavior was 
derived in the seminal paper by Gupta and Kumar [8] and has 
further received significant attention (e.g., see [9]-[11]). This 
architecture not only differs from ad-hoc or collaborative 
networks in its application, but also shows a fundamentally 
different behavior due to its dependence on the video reuse 
statistics. We provide a closed form expression of the optimal 
collaboration distance as a function of the content reuse 
distribution parameters. 

We model the request statistics for video files by a Zipf 
distribution, which has been shown to fit well with measured 
YouTube video requests [12], [13]. We find that the scaling 
laws depend critically on the Zipf parameter, i.e., on the 
concentration of the request distribution. We show that if the 
Zipf exponent of the content reuse distribution is greater than 
1, it is possible to have a number of D2D interference-free 
collaboration pairs that scales linearly with the number of 
nodes. If the Zipf exponent is smaller than 1, we identify 
the best possible scaling in the number of D2D collaborating 
links. Surprisingly, a very simple distributed caching policy 
achieves the optimal scaling behavior and therefore there is 
no need to centrally coordinate what each node is caching. 
For Zipf exponent equal to 1, we find the best collaboration 
distance and the best possible scaling. 

The remainder of this paper is organized as follows: In Section II 
we set up the D2D formulation and explain the tradeoff 
between collaboration distance and interference. Section III 
contains our main theorems, the scaling behavior for Zipf 
exponents greater than, smaller than, and equal to 1. In Section IV 
we discuss future directions, open problems and conclusions. 
Finally, the Appendix contains the proofs of our theorems. 

Fig. 1. Random geometric graph example with collaboration distance r(n). 



II. Model and Setup 



In this section, we discuss the fundamental system model; 
for a discussion of the assumptions, and justifications of 
simplifications, we refer the interested reader to [7]. 

Assume a cellular network where each cell/base station 
(BS) serves n users. For simplicity we assume that the cells 
are square, and we neglect inter-cell interference, so that 
we can consider one cell in isolation. Users are distributed 
randomly and independently in the cell. We assume that the 
D2D communication does not interfere with the base station 
that can serve video requests that cannot be otherwise covered. 
For that reason, our only concern is the maximization of the 
number of D2D collaboration links that can be simultaneously 
scheduled. We henceforth do not need to consider explicitly 
the BS and its associated communications. 

The communication is modeled by a standard protocol 
model on a random geometric graph (RGG) G(n,r(n)). In 
this model users are randomly and uniformly distributed in a 
square (cell) of size 1. Two users (assuming D2D communication 
is possible) can communicate if their Euclidean distance 
is smaller than some collaboration distance r(n) [8], [14]. The 
maximum allowable distance for D2D communication r(n) is 
determined by the power level for each transmission. Figure 
1 illustrates an example of an RGG. 

We assume that users may request files from a set of size 
m that we call a "library". The size of this set should increase 
as a function of the number of users n. Intuitively, the set of 
YouTube videos requested in Berkeley in one day should be 
smaller than the set requested in Los Angeles. We assume 
that this growth should be sublinear in n, e.g., m could be 
$\Theta(\log(n))$.¹ 

Each user requests a file from the library by sampling 
independently using a popularity distribution. Based on several 
studies, Zipf distributions have been established as good models 
for the measured popularity of video files [12], [13]. Under 
this model, the popularity of the ith popular file, denoted by 
$f_i$, is inversely proportional to its rank: 

$$f_i = \frac{1/i^{\gamma_r}}{\sum_{j=1}^{m} 1/j^{\gamma_r}}, \qquad 1 \le i \le m. \tag{1}$$

The Zipf exponent $\gamma_r$ characterizes the distribution by 
controlling the relative popularity of files. Larger $\gamma_r$ exponents 
correspond to higher content reuse, i.e., the first few popular 
files account for the majority of requests. 
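
To make the effect of the Zipf exponent concrete, the following minimal Python sketch evaluates the popularity distribution in (1) and the share of requests captured by the most popular files. The function names `zipf_popularity` and `head_mass` are illustrative choices, not notation from the paper.

```python
def zipf_popularity(m, gamma_r):
    """Return [f_1, ..., f_m] for a Zipf distribution with exponent gamma_r."""
    # Unnormalized weight of the file with rank i is 1 / i**gamma_r.
    weights = [1.0 / (i ** gamma_r) for i in range(1, m + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def head_mass(m, gamma_r, top):
    """Fraction of requests that fall on the `top` most popular files."""
    return sum(zipf_popularity(m, gamma_r)[:top])


if __name__ == "__main__":
    for gamma_r in (0.5, 1.0, 1.5):
        share = head_mass(1000, gamma_r, 10)
        print(f"gamma_r={gamma_r}: top-10 share of requests = {share:.3f}")
```

With a library of 1000 files, the ten most popular files capture a much larger share of requests for an exponent above 1 than for an exponent below 1, which is the sense in which larger exponents correspond to higher content reuse.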

Each user has a storage capacity called cache which is 
populated with some video files. For our scaling law analysis 
we assume that all files have the same size, and each user 
can store one file. This yields a clean formulation and can be 
easily extended for larger storage capacities. 

Our scheme works as follows: If a user requests one of 
the files stored in neighbors' caches in the RGG, neighbors 
will handle the request locally through D2D communication; 
otherwise, the BS should serve the request. Thus, to have D2D 
communication it is not sufficient that the distance between 
two users be less than r(n); users should find their desired 
files locally in caches of their neighbors. A link between 
two users will be called potentially active if one requests a 
file that the other is caching. Therefore, the probability of 
D2D collaboration opportunities depends on what is stored 
and requested by the users. 

The decision of what to store can be taken in a distributed 
or centralized way. A central control of the caching by the BS 
allows very efficient file-assignment to the users. However, if 
such control is not desired or the users are highly mobile, 
caching has to be optimized in a distributed way. The simple 
randomized caching policy we investigate makes each user 
choose which file to cache by sampling from a caching 
distribution. It is clear that popular files should be stored with 
a higher probability, but the question is how much redundancy 
we want to have in our distributed cache. 
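
As an illustration of the randomized caching policy, the sketch below lets each user draw one cached file from a Zipf caching distribution and evaluates the probability that a request is found among k independently populated neighbor caches. The caching exponent `gamma_c`, the helper names, and the closed-form hit probability $\sum_j f_j (1 - (1 - p_j)^k)$ are assumptions made for this example; they follow from independent sampling and are not the paper's implementation.

```python
import random


def zipf(m, gamma):
    """Zipf probabilities over ranks 1..m with exponent gamma."""
    weights = [1.0 / (i ** gamma) for i in range(1, m + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def sample_caches(n, m, gamma_c, rng):
    """Each of n users independently caches one file drawn from Zipf(gamma_c)."""
    p = zipf(m, gamma_c)
    return rng.choices(range(1, m + 1), weights=p, k=n)


def local_hit_probability(m, gamma_r, gamma_c, k):
    """Probability that a request drawn from Zipf(gamma_r) is cached by at
    least one of k neighbors who cache independently from Zipf(gamma_c)."""
    f = zipf(m, gamma_r)
    p = zipf(m, gamma_c)
    return sum(fj * (1.0 - (1.0 - pj) ** k) for fj, pj in zip(f, p))


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_caches(10, 100, 1.5, rng))
    for k in (1, 5, 20):
        print(k, round(local_hit_probability(100, 1.5, 1.5, k), 3))
```

The hit probability grows with the number of neighbors k, which is the redundancy question raised above: a more concentrated caching distribution duplicates popular files, while a flatter one covers more of the library.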

We assume that all D2D links share the same time-frequency 
transmission resource within one cell area. This is possible 
since the distance between the requesting user and the user with 
the stored file will typically be small. However, there should be no 
interference from other transmissions on an active D2D link. 
We assume that (given that node u wants to transmit to node 
v) any transmission within range r(n) from v (the receiver) 
can introduce interference for the u → v transmission. Thus, 
they cannot be activated simultaneously. This model is known 
as the protocol model; while it neglects important wireless propagation 
effects such as fading [15], it can provide fundamental 
¹ We use the standard Landau notation: $f(n) = O(g(n))$ and $f(n) = \Omega(g(n))$ respectively denote $|f(n)| \le c_1 g(n)$ and $|f(n)| \ge c_2 g(n)$ for some constants $c_1, c_2$. $f(n) = \Theta(g(n))$ stands for $f(n) = O(g(n))$ and $f(n) = \Omega(g(n))$. Little-o notation, i.e., $f(n) = o(g(n))$, is equivalent to $\lim_{n \to \infty} f(n)/g(n) = 0$. 







Fig. 2. Random geometric graph, yellow and green nodes indicate receivers, 
transmitters in D2D links. Gray nodes get their request files from the BS. 
Arrows show all possible D2D links. 



insights and has been widely used in prior literature [8]. 

To model interference given a storage configuration and user 
requests we start with all potential D2D collaboration links. 
Then, we construct the conflict graph as follows. We model 
any possible D2D link between node u as transmitter to node 
v as a receiver with a vertex u → v in the conflict graph. Then, 
we draw an edge between any two vertices (links) that create 
interference for each other according to the protocol model. 
Figure [3] shows how the RGG in Figure [2] is converted to 
the conflict graph. In Figure [2] receiver nodes are green and 
transmitter nodes are yellow. The nodes that should receive 
their desired files from the BS are gray. A set of D2D links is 
called active if they are potentially active and can be scheduled 
simultaneously, i.e., form an independent set in the conflict 
graph. The random variable counting the number of active 
D2D links under some policy is denoted by L. 

Figure [3] shows the conflict graph and one of the maximum 
independent sets for the conflict graph. We can see that out of 14 
possible D2D links 9 links can co-exist without interference. 
As is well known, determining the maximum independent 
set of an arbitrary graph is computationally intractable (NP-complete 
[16]). Despite the difficulty of characterizing the 
number of interference-free active links, we can determine the 
best possible scaling law in our random ensemble. 
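
The construction above can be sketched in a few lines of Python. The code lists the potentially active links, applies the protocol-model conflict rule, and picks a set of non-conflicting links greedily. Note the assumptions: the greedy pass only yields a maximal independent set (a lower bound on the maximum one, which is intractable in general), two links that share a node are treated as conflicting, and all function and variable names are hypothetical.

```python
import math


def potential_links(pos, cache, request, r):
    """All (transmitter u, receiver v) pairs within distance r where u caches
    the file v requests. Self-served requests are excluded."""
    links = []
    n = len(pos)
    for u in range(n):
        for v in range(n):
            if u == v or cache[v] == request[v]:
                continue
            if cache[u] == request[v] and math.dist(pos[u], pos[v]) < r:
                links.append((u, v))
    return links


def conflicts(a, b, pos, r):
    """Protocol model: two links conflict if they share a node (assumption) or
    if either transmitter lies within r of the other link's receiver."""
    (u1, v1), (u2, v2) = a, b
    if {u1, v1} & {u2, v2}:
        return True
    return math.dist(pos[u2], pos[v1]) < r or math.dist(pos[u1], pos[v2]) < r


def greedy_active_links(links, pos, r):
    """Greedy maximal independent set in the conflict graph."""
    active = []
    for link in links:
        if all(not conflicts(link, other, pos, r) for other in active):
            active.append(link)
    return active


if __name__ == "__main__":
    pos = [(0.0, 0.0), (0.05, 0.0), (0.9, 0.9), (0.95, 0.9)]
    cache = [1, 2, 1, 2]
    request = [2, 1, 2, 1]
    links = potential_links(pos, cache, request, r=0.1)
    print(links, greedy_active_links(links, pos, r=0.1))
```

In this toy layout there are four potential links in two well-separated pairs, and one link per pair can be scheduled at the same time.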

III. Analysis 

A. Finding the optimal collaboration distance 

We are interested in determining the best collaboration 
distance r(n) and caching policy such that the expected 
number of active D2D links is maximized. Our optimization 
is based on balancing the following tension: The smaller the 
transmit power, the smaller the region in which a D2D com- 
munication creates interference. Therefore, more D2D pairs 



Fig. 3. Conflict graph based on Figure [2] and one maximum independent set 
of the conflict graph; pink vertices are those D2D links that can be activated 
simultaneously. 



can be packed into the same area allowing higher frequency 
reuse. On the other hand, a small transmit power might not be 
sufficient to reach a mobile that stores the desired file. Smaller 
power means smaller distance and hence smaller probability 
of collaboration opportunities. 

We analyze the case where the nodes do not possess power 
control with fast adaptation, but rather all users have the same 
transmit power that depends only on the node density. We then 
show how to optimize it based on the content request statistics. 
Our analysis involves finding the best compromise between the 
number of possible parallel D2D links and the probability of 
finding the requested content, as discussed above. Our results 
consist of two parts. In the first part (upper bound), we find 
the best achievable scaling for the expected number of active 
D2D links. In the second part (achievability), we determine 
an optimal caching policy and r(n) to obtain the best scaling 
for the expected number of active links E[L]. 

The best achievable scaling for the expected number of active 
D2D links depends on the extent of content reuse. Larger 
Zipf distribution exponents correspond to more redundancy in 
the user requests and a small number of files accounts for 
the majority of video traffic. Thus, the probability of finding 
requested files through D2D links increases by having access 
to few popular files via neighbors. 

We separate the problem into three different regions depending 
on the Zipf exponent: $\gamma_r > 1$, $\gamma_r < 1$, and $\gamma_r = 1$. For 
each of these regions, we find the best achievable scaling for 
E[L] and the optimum asymptotic r(n), denoted by $r_{opt}(n)$. 
We also show that for the $\gamma_r > 1$ and $\gamma_r < 1$ regions a simple 
distributed caching policy has optimal scaling, i.e., matches 
the scaling behavior that any centralized caching policy could 
achieve. This caching policy means that each device stores 
files randomly, with a properly chosen caching distribution, 
namely a Zipf distribution with parameter $\gamma_c$. For $\gamma_r = 1$, we 
present an optimal centralized caching policy. 

Our first result is the following theorem: 

Theorem 1: If the Zipf exponent $\gamma_r > 1$, 

i) Upper bound: For any caching policy, $E[L] = O(n)$, 
ii) Achievability: Given that $c_1\sqrt{1/n} \le r_{opt}(n) \le c_2\sqrt{1/n}$ ² and using a Zipf caching distribution with exponent $\gamma_c > 1$, then $E[L] = \Theta(n)$. 

The first part of Theorem 1 is trivial since the number of 
active D2D links can at most scale linearly in the number of 
users. The second part indicates that if we choose $r_{opt}(n) = \Theta(\sqrt{1/n})$ 
and $\gamma_c > 1$, E[L] can grow linearly with n. There is 
some simple intuition behind this result: We show that in this 
regime users are surrounded by a constant number of users in 
expectation. If the Zipf exponent $\gamma_c$ is greater than one, this 
suffices to show that the probability that they can find their 
desired files locally is a non-vanishing constant as n grows. 
Our proof is provided in Appendix A. 

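For the low content reuse region $\gamma_r < 1$, we obtain the 
following result: 

Theorem 2: If $\gamma_r < 1$, 

i) Upper bound: For any caching policy, $E[L] = O\left(\frac{n}{m^{\eta}}\right)$, where $\eta = \frac{1-\gamma_r}{2-\gamma_r}$. 
ii) Achievability: If $c_3\sqrt{\frac{m^{\eta+\epsilon}}{n}} \le r_{opt}(n) \le c_4\sqrt{\frac{m^{\eta+\epsilon}}{n}}$ and 
users cache files randomly and independently according 
to a Zipf distribution with exponent $\gamma_c$, for any exponent 
$\eta + \epsilon$, there exists $\gamma_c$ such that $E[L] = \Omega\left(\frac{n}{m^{\eta+\epsilon}}\right)$, where 
$0 < \epsilon < \frac{1}{6}$ and $\gamma_c$ is a solution to the following equation: 

$$\frac{(1-\gamma_r)\gamma_c}{1-\gamma_r+\gamma_c} = \eta + \epsilon.$$

Our proof is provided in Appendix B. 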

We show that when there is low content reuse, linear scaling 
in frequency re-use is not possible. At a high level, in order 
to achieve the optimal scaling, on average a user should be 
surrounded by $\Theta(m^{\eta})$ users. Comparing with the first region 
where $\gamma_r > 1$, we can conclude that when there is less 
redundancy, users have to see more users in the neighborhood 
to find their desired files locally. 
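
The required neighborhood size can be illustrated numerically. The sketch below assumes $\eta = (1-\gamma_r)/(2-\gamma_r)$, which is the exponent obtained by balancing the per-cluster bound $k^{2-\gamma_r}/m^{1-\gamma_r}$ used in Appendix B against a constant; this expression and the function names are assumptions for illustration.

```python
def eta(gamma_r):
    """Assumed exponent eta = (1 - gamma_r) / (2 - gamma_r) for gamma_r < 1."""
    return (1.0 - gamma_r) / (2.0 - gamma_r)


def balanced_neighbors(m, gamma_r):
    """Neighborhood size k = m**eta at which k**(2-gamma_r) equals m**(1-gamma_r)."""
    return m ** eta(gamma_r)


if __name__ == "__main__":
    m = 10_000
    for gamma_r in (0.2, 0.5, 0.8):
        k = balanced_neighbors(m, gamma_r)
        ratio = k ** (2.0 - gamma_r) / m ** (1.0 - gamma_r)
        print(f"gamma_r={gamma_r}: eta={eta(gamma_r):.3f}, k={k:.1f}, ratio={ratio:.3f}")
```

Under this assumption the exponent shrinks toward 0 as $\gamma_r$ approaches 1, so fewer neighbors are needed as content reuse increases.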

Theorem 3: If $\gamma_r = 1$, 

i) Upper bound: For any r(n), $E[L] = O\left(\frac{n \log\log(m)}{\log(m)}\right)$, 
ii) Achievability: Given that $c_5\sqrt{\frac{\log(m)}{n \log\log(m)}} \le r(n) \le c_6\sqrt{\frac{\log(m)}{n \log\log(m)}}$, there exists a centralized strategy such 
that $E[L] = \Theta\left(\frac{n \log\log(m)}{\log(m)}\right)$. 

IV. Discussion and Conclusions 

As mentioned in Sec. I, the study of scaling laws of the 
capacity of wireless networks has received significant attention 
since the pioneering work by Gupta and Kumar [8] (e.g., see 
[9]-[11]). The first result was pessimistic: if n nodes are trying 
to communicate (say by forming n/2 pairs), since the typical 
distance in a 2D random network will involve roughly $\Theta(\sqrt{n})$ 
hops, the throughput per node must vanish, approximately 
scaling as $1/\sqrt{n}$. There are, of course, sophisticated arguments 
performing rigorous analysis that sharpen the bounds, and 
numerous interesting model extensions. One that is particularly 
relevant to this project is the work by Grossglauser and 

² c and the $c_i$'s are positive constants that do not depend on n. 



Tse [10] that showed that if the nodes have infinite storage 
capacity, full mobility and there is no concern about delay, 
constant (non-vanishing) throughput per node can be sustained 
as the network scales. 

Despite the significant amount of work on ad hoc networks, 
there has been very little work on file sharing and content 
distribution over wireless ([3], [17]) beyond the multiple 
unicast traffic patterns introduced in [8]. Our result shows 
that if there is sufficient content reuse, caching fundamentally 
changes the picture: non-vanishing throughput per node can be 
achieved, even with constant storage and delay, and without 
any mobility. 

On a more technical note, the most surprising result is 
perhaps the fact that in Theorem 2, a simple distributed policy 
can match the optimal scaling behavior $E[L] = O(n/m^{\eta})$. This 
means that even if it were possible for a central controller 
to impose on the devices what to store, the scaling behavior 
could not improve beyond the random caching policy (though, 
of course, the actual numerical values for finite device density 
could be different). Further, for both regimes of $\gamma_r$, the 
distributed caching policy exponent $\gamma_c$ should not match the 
request Zipf exponent $\gamma_r$, something that we found quite 
counterintuitive. 

Overall, even if linear frequency re-use is not possible, we 
expect the scaling of the library m to be quite small (typically 
logarithmic) in the number of users n. In this case we obtain 
near-linear (up to logarithmic factors) growth in the number of 
D2D links for the full spectrum of Zipf exponents. Our results 
are encouraging and show that device-based caching and D2D 
communications can lead to a drastic increase of wireless video 
throughput; and that the benefits increase as the number of 
participants increases. This in turn implies that the highest 
throughput gains are achieved in those areas where they are 
most needed, i.e., where the devices are most concentrated. 

Appendix A 
Proof of Theorem 1 

The first part of the theorem is easy to see since the number 
of D2D links cannot exceed the number of users. Next, we 
show the second part of the theorem. 

For the second part of the theorem, we introduce virtual 
clusters and we show that the number of virtual clusters that 
can be potentially active, called good clusters, scales like the 
number of active links. To find a lower bound on the number of 
good clusters, we limit users to communicating with neighbors in 
the same cluster. Then, we express the probability that a cluster 
is good as a function of the files stored by users within the cluster. 
Excluding self-requests, i.e., cases where users find their requested 
files in their own caches, we find a lower bound for good clusters. 
We further define a value for each cluster, which is the sum 
of the popularities of the files stored by its users. Then we express 
the probability of goodness as a function of the value of clusters. 
Using the Chernoff bound, we finalize our proof. 

A. Active links versus good clusters 



We divide the cell into $2/r(n)^2$ virtual square clusters. Figure 
4(a) shows the virtual clusters in the cell. The cell side is 
normalized to 1 and the side of each cluster is equal to $r(n)/\sqrt{2}$. 
Thus, all users within a cluster can communicate with each 
other. Based on our interference model, in each cluster only 
one link can be activated. When there is an active D2D link 
within a cluster, we call the cluster good. But not all good 
clusters can be activated simultaneously. According to the protocol 
model, one good cluster can block at most 16 clusters (see 
Figure 4(b)). The maximum interference happens when a user 
in the corner of a cluster transmits a file to a user in the 
opposite corner. So, we have 

$$E[L] \ge \frac{E[G]}{16 + 1}, \tag{2}$$

where E[G] is the expected number of good clusters. 

Since the number of active links scales like the number of 
good clusters, to prove the theorem it is enough to show that 
a constant fraction of virtual clusters are good. This is because 
$r(n) = \Theta(\sqrt{1/n})$ and there are $\Theta(n)$ virtual clusters in the cell. 

B. Limiting users 

Since we want to find the lower bound for E[L], we can 
limit users to communicate with users in virtual clusters they 
belong to. Hence, 

$$E[G] \ge \frac{2}{r(n)^2}\sum_{k=0}^{n} \Pr[\text{good} \mid k]\,\Pr[K = k], \tag{3}$$

where $2/r(n)^2$ is the total number of virtual clusters. K is 
the number of users in the cluster, which is a binomial 
random variable with n trials and probability $r(n)^2/2$, i.e., 
$K = B(n, r(n)^2/2)$. $\Pr[K = k]$ is the probability that there are 
k users in the cluster and $\Pr[\text{good} \mid k]$ is the probability that 
the cluster is good conditioned on k. 
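
The clustering step can be sketched as follows. The code drops n users uniformly in the unit square, tiles the square with clusters whose side is at most $r/\sqrt{2}$ (so that any two users in a cluster are within distance r), and counts users per cluster. Rounding the number of clusters per side up to an integer is an assumption made so the grid tiles the cell exactly; the function name is hypothetical.

```python
import math
import random


def cluster_counts(n, r, rng):
    """Place n users uniformly in the unit square and count users per cluster.

    Returns (g, counts) where g is the number of clusters per side and
    counts[row * g + col] is the number of users in that cluster."""
    g = math.ceil(math.sqrt(2.0) / r)  # cluster side 1/g <= r / sqrt(2)
    counts = [0] * (g * g)
    for _ in range(n):
        x, y = rng.random(), rng.random()
        col = min(int(x * g), g - 1)
        row = min(int(y * g), g - 1)
        counts[row * g + col] += 1
    return g, counts


if __name__ == "__main__":
    n = 10_000
    r = 2.0 / math.sqrt(n)  # r(n) proportional to sqrt(1/n)
    g, counts = cluster_counts(n, r, random.Random(0))
    print("clusters:", g * g, "mean users per cluster:", n / (g * g))
    print("empty clusters:", sum(1 for c in counts if c == 0))
```

With r(n) proportional to $\sqrt{1/n}$, the number of clusters grows linearly in n while the mean number of users per cluster stays constant, matching the binomial model for K.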

C. Probability of goodness and stored files 

To show the result, we should prove that the summation in 
(3), i.e., the probability that a cluster is good, does not vanish 
as n goes to infinity. The probability that a cluster is good 
depends on what users cache. Therefore, 

$$\Pr[\text{good} \mid k] = \sum_{\{\omega \,:\, |\omega| = k\}} \Pr[\text{good} \mid k, \omega]\,\Pr[\omega], \tag{4}$$

where $\omega$ is a random vector of stored files by users in the 
cluster and $|\omega|$ denotes the length of vector $\omega$. The ith element 
of $\omega$, denoted by $\omega_i \in \{1, 2, 3, \ldots, m\}$, indicates what user i 
in the cluster stores. 

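For each $\omega$, we define a value: 

$$v(\omega) = \sum_{j \in \bar{\omega}} f_j, \tag{5}$$

where $\bar{\omega} = \bigcup_{j=1}^{|\omega|} \omega_j$ and $\bigcup$ is the union operation. That is, 
$v(\omega)$ is the sum of popularities of the union of files in $\omega$. 
The cluster is considered to be good if at least one user i in the 
cluster requests one of the files in $\bar{\omega} - \{\omega_i\}$. 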
Fig. 4. a) Dividing cell into virtual clusters, b) In the worst case, a good 
cluster can block at most 16 clusters. In the dashed circle, receiving is not 
possible and in the solid circle, transmission is not allowed. 



D. Excluding self request 

A user might find the file it requests in its own cache; in this 
case clearly no D2D communication will be activated by this 
user. We call these cases self-requests. Accounting for these 
self-requests, the probability that user i finds its requested file 
locally within the cluster is $(v(\omega) - f_{\omega_i})$. Thus, we obtain: 

$$\Pr[\text{good} \mid k, \omega] \ge 1 - \left(1 - \left(v(\omega) - \max_i f_{\omega_i}\right)\right)^k. \tag{6}$$

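Let us only consider cases where at least one user in the 
cluster caches file 1 (the most popular file). Then, from (4) 
and (6), the following lower bound is achieved: 

$$E[G] \ge \frac{2}{r(n)^2}\sum_{k=1}^{n} \Pr[K = k] \sum_{\omega \in x} \left[1 - \left(1 - (v(\omega) - f_1)\right)^k\right] \Pr[\omega], \tag{7}$$

where $x = \{\omega \mid |\omega| = k \text{ and } 1 \in \bar{\omega}\}$. 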

E. Probability of goodness and value of clusters 

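Instead of taking expectation with respect to $\omega$, we take 
expectation with respect to v, i.e., the value of a cluster. Then, 

$$E[G] \ge \frac{2}{r(n)^2}\sum_{k=1}^{n} \Pr[K = k]\, E_v\!\left[1 - (1 - (v - f_1))^k \,\middle|\, A_1^k\right] \Pr[A_1^k]$$

$$\ge \frac{2}{r(n)^2}\sum_{k=1}^{n} \Pr[K = k]\, E_v\!\left[(v - f_1) \,\middle|\, A_1^k\right] \Pr[A_1^k],$$

where $A_1^k$ is the event that at least one of k users in the cluster 
caches file 1 and $E_v[\cdot]$ is the expectation with respect to v. 
Let $A^k_{1,h}$ for $1 \le h \le k$ denote the event that h users out of 
k users in the cluster cache file 1. Then, we get: 

$$E[G] \ge \frac{2}{r(n)^2}\sum_{k=1}^{n} \Pr[K = k] \sum_{h=1}^{k} E_v\!\left[(v - f_1) \,\middle|\, A^k_{1,h}\right] \Pr[A^k_{1,h}], \tag{8}$$

where $\Pr[A^k_{1,h}] = \binom{k}{h} p_1^h (1 - p_1)^{k-h}$ and $p_j$ represents 
the probability that file j is cached by a user based on a Zipf 
distribution with exponent $\gamma_c$. To calculate $E_v[(v - f_1) \mid A^k_{1,h}]$, 
we define an indicator function $1_j$ for each file $j \ge 2$. $1_j$ 
is equal to 1 if at least one user in the cluster stores file j. 
Hence, 

$$E_v\!\left[(v - f_1) \,\middle|\, A^k_{1,h}\right] = E\!\left[\sum_{j=2}^{m} f_j 1_j \,\middle|\, A^k_{1,h}\right] = \sum_{j=2}^{m} f_j\left(1 - (1 - p_j)^{k-h}\right).$$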
3=2 

F. Chernoff bound 

To show that the probability that a cluster is good is not 
vanishing, we use the Chernoff bound. First, we limit k to an 
interval around its average. By substituting $E_v[(v - f_1) \mid A^k_{1,h}]$ in (8), 

$$E[G] \ge \frac{2}{r(n)^2}\sum_{k \in I} \Pr[K = k] \sum_{h=1}^{k} \sum_{j=2}^{m} f_j\left(1 - (1 - p_j)^{k-h}\right)\Pr[A^k_{1,h}], \tag{9}$$

where for any $0 < \delta < 1$ the interval $I = [nr(n)^2(1 - \delta)/2,\; nr(n)^2(1 + \delta)/2]$. 
Define $k^* \in I$ such that it minimizes the expression in the last line of (9). 
Since $r(n) = \Theta(\sqrt{1/n})$, $k^*$ is $\Theta(1)$. Then from (9) we have: 

$$E[G] \ge \frac{2}{r(n)^2}\Pr[K \in I] \times \sum_{h=1}^{k^*} \Pr[A^{k^*}_{1,h}] \sum_{j=2}^{m} f_j\left(1 - (1 - p_j)^{k^*-h}\right) \tag{10}$$

$$\ge \frac{2}{r(n)^2}\left(1 - 2\exp(-nr(n)^2\delta^2/6)\right) \times \sum_{h=k^*p_1(1-\delta_1)}^{k^*p_1(1+\delta_1)} \Pr[A^{k^*}_{1,h}] \sum_{j=2}^{m} f_j\left(1 - (1 - p_j)^{k^*-h}\right), \tag{11}$$

where $0 < \delta_1 < 1$. We apply the Chernoff bound in 
(10) to derive (11) [18]. Since the exponent $nr(n)^2\delta^2/6$ 
is $\Theta(1)$, we can select the constant $c_1$ such that the term 
$1 - 2\exp(-nr(n)^2\delta^2/6)$ becomes positive. 

Let us define $h^* \in [k^*p_1(1 - \delta_1),\, k^*p_1(1 + \delta_1)]$ such that 
it minimizes the inner summation of (11), i.e., $\sum_{j=2}^{m} f_j(1 - (1 - p_j)^{k^*-h})$. 
From (1), $p_1$ is $1/H(\gamma_c, 1, m)$, where the function H is 
defined in Lemma 1 in the Appendix "Some preliminary lemmas". 
Lemma 1 implies that $p_1 = \Theta(1)$ and as a result, $h^*$ is also 
$\Theta(1)$. Using the Chernoff bound for the random variable h in (11) 
we get: 

$$E[G] \ge \frac{2}{r(n)^2}\left(1 - 2\exp(-nr(n)^2\delta^2/6)\right) \times \left(1 - 2\exp(-k^*p_1\delta_1^2/3)\right)\sum_{j=2}^{m} f_j\left(1 - (1 - p_j)^{k^*-h^*}\right). \tag{12}$$

$k^* - h^*$ should be greater than 1, which results in a constant 
lower bound for $c_1$. The second exponent, i.e., $k^*p_1\delta_1^2/3$, 
is $\Theta(1)$. Therefore, the term $(1 - 2\exp(-k^*p_1\delta_1^2/3))$ is a 
positive constant if $c_1$ is large enough. Further, the summation 
in (12) satisfies 

$$\sum_{j=2}^{m} f_j\left(1 - (1 - p_j)^{k^*-h^*}\right) \ge \sum_{j=2}^{m} f_j p_j.$$

To show that E[G] scales linearly with n, the term $\sum_{j=2}^{m} f_j p_j$ 
should not be vanishing as n goes to infinity. Using part (iv) 
of Lemma 1 we can see that if $\gamma_r, \gamma_c > 1$, $\sum_{j=2}^{m} f_j p_j = \Theta(1)$. 
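
The last claim can be checked numerically. The sketch below evaluates $\sum_{j=2}^{m} f_j p_j$ for request and caching distributions that are both Zipf, and compares a growing library for exponents above and below 1. It is only a numerical illustration under these assumptions, not a substitute for Lemma 1; the function names are hypothetical.

```python
def zipf(m, gamma):
    """Zipf probabilities over ranks 1..m with exponent gamma."""
    weights = [1.0 / (i ** gamma) for i in range(1, m + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def overlap(m, gamma_r, gamma_c):
    """Sum over files j >= 2 of f_j * p_j (request times caching probability)."""
    f = zipf(m, gamma_r)
    p = zipf(m, gamma_c)
    return sum(fj * pj for fj, pj in zip(f[1:], p[1:]))


if __name__ == "__main__":
    for m in (100, 1_000, 10_000):
        high = overlap(m, 1.5, 1.5)
        low = overlap(m, 0.5, 0.5)
        print(f"m={m}: exponents 1.5 -> {high:.4f}, exponents 0.5 -> {low:.6f}")
```

For exponents of 1.5 the sum levels off at a positive constant as m grows, while for exponents of 0.5 it decays toward zero, which is why the linear scaling argument only goes through in the high-reuse regime.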

Appendix B 
Proof of Theorem 2 

To show the first part of the theorem, as in the proof of 
Theorem 1, we use virtual clusters. We show that the number of 
active links can be at most equal to the number of good 
clusters. We state the probability of goodness as a function 
of stored files. To be more precise, we express this probability 
as an intersection of some decreasing events. Then, we use the 
FKG inequality to find an upper bound for the probability of 
goodness. Finally, we divide the whole range of r(n) into 
four non-overlapping regions and show the upper bound for 
all regions. 

A. Active links versus good clusters 

To show the first part of the theorem, as in the proof of 
Theorem 1, we divide the cell into $2/r(n)^2$ virtual square clusters. 
All users within a cluster can communicate with each other. 
Based on the protocol model, in each cluster only one link can 
be activated. As stated before, when there is an active D2D link 
within a cluster, we call the cluster good. In the best case, all 
the good clusters can be activated simultaneously. Hence, 

$$E[L] \le E[G],$$

where E[G] is the average number of good clusters. All users can 
look for their desired files not only in their own clusters but in 
the caches of all users in their vicinities. The maximum area 
that can be covered by all users in a cluster cannot be larger 
than $a\,r(n)^2$ where $a = (1/\sqrt{2} + 2)^2$ (the area of the dashed square 
in Figure 5). Therefore, 

$$E[L] \le \frac{2}{r(n)^2}\sum_{k=0}^{n} \Pr[\text{good} \mid k]\,\Pr[K = k], \tag{13}$$

where K is the number of users in the dashed square 
(called maximum square) in Figure 5, which is a binomial 
random variable with n trials and probability $a\,r(n)^2$, 
$K = B(n, a\,r(n)^2)$. 

Fig. 5. Maximum area covered by all users within a cluster (blue square) 

B. Probability of goodness and stored files 

$\Pr[\text{good} \mid k]$ is the probability that a cluster is good conditioned 
on k, and it depends on what users in the maximum 
square store, denoted by $\omega$: 

$$\Pr[\text{good} \mid k] = \sum_{\omega} \Pr[\text{good} \mid k, \omega]\,\Pr[\omega].$$

Let us define an event $A_i(\omega)$ that user i finds its request either 
in the cache of its neighbors or its own cache. 

$$\Pr[\text{good} \mid k, \omega] \le \Pr[A_1(\omega) \cup A_2(\omega) \cup \ldots \cup A_k(\omega)] = 1 - \Pr[\bar{A}_1(\omega) \cap \bar{A}_2(\omega) \cap \ldots \cap \bar{A}_k(\omega)]. \tag{14}$$

Events $A_i(\omega)$ and $A_j(\omega)$ for $j \ne i$ are dependent since they 
both depend on $\omega$. The probability that event $A_i(\omega)$ happens 
is: 

$$\Pr[A_i(\omega)] = \sum_{j=1}^{m} f_j 1_j,$$

where $f_j$ is the probability that user i requests file j. $1_j$ is an 
indicator function for file j and it is one if file $j \in \bar{\omega}$. It is 
easy to check that $v(\omega)$ in (5) is equal to $\Pr[A_i(\omega)]$ for any 
i. 

C. Increasing events and FKG inequality 

To find an upper bound for the intersection of dependent events 
$\bar{A}_i(\omega)$ in (14), we first show that they are decreasing events. 
Then, we use the FKG inequality for decreasing events [19]. 

Definition 1: (Increasing event). A random variable X is 
increasing on $(\Omega, F)$ if $X(\omega) \le X(\omega')$ whenever $\omega \le \omega'$. It 
is decreasing if $-X$ is increasing. 

We assume that $\omega \le \omega'$ if the value of $\omega$ is less than the 
value of $\omega'$, i.e., 

$$v(\omega) \le v(\omega'),$$

where the value of $\omega$ is defined in (5). Thus, according 
to this definition, event $A_i(\omega)$ for any $1 \le i \le k$ is an 
increasing event. Applying the FKG inequality for correlated 
and decreasing events $\bar{A}_i(\omega)$ [19]: 

$$\Pr[\bar{A}_1(\omega) \cap \bar{A}_2(\omega) \cap \ldots \cap \bar{A}_k(\omega)] \ge \Pr[\bar{A}_1(\omega)]^k. \tag{15}$$

From (14) and (15), we obtain: 

$$\Pr[\text{good} \mid k, \omega] \le 1 - \Pr[\bar{A}_1(\omega)]^k \tag{16}$$

$$\le 1 - \Big(1 - \sum_{j=1}^{k} f_j\Big)^k. \tag{17}$$

To derive (17), we used the fact that the probability of event 
$A_i(\omega)$ is maximized if the k most popular files are in $\omega$. The 
obtained upper bound in (17) does not depend on $\omega$. Hence, 

$$\Pr[\text{good} \mid k] \le 1 - \Big(1 - \sum_{j=1}^{k} f_j\Big)^k. \tag{18}$$



In the following, we will consider four non overlapping regions 
for r(n) and for each region, we will prove the first part of 
the theorem. 



D. First region 

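We first consider the region $r(n) = O(\sqrt{1/n})$. From (13) and (18),

$$E[L] \le \frac{2}{r(n)^2} \sum_{k=0}^{n} \Big[1 - \Big(1 - \sum_{j=1}^{k} f_j\Big)^{k}\Big] \Pr[K = k] \tag{19a}$$

$$\le \frac{2}{r(n)^2} \sum_{k=1}^{n} k \sum_{j=1}^{k} f_j \Pr[K = k]. \tag{19b}$$

Using part (iii) of lemma 1, the second summation satisfies $\sum_{j=1}^{k} f_j \le 2 k^{1-\gamma_r} / m^{1-\gamma_r}$. Thus,

$$E[L] \le \frac{4}{r(n)^2 m^{1-\gamma_r}} \sum_{k=0}^{n} k^{2-\gamma_r} \Pr[K = k] \le \frac{4}{r(n)^2 m^{1-\gamma_r}} \sum_{k=0}^{n} k^{2} \Pr[K = k].$$

For the binomial random variable $K = B(n, a r(n)^2)$,

$$E[K^2] = (a n r(n)^2)^2 + a n r(n)^2 (1 - a r(n)^2).$$

Therefore,

$$E[L] \le \frac{4}{r(n)^2 m^{1-\gamma_r}} \Big((a n r(n)^2)^2 + a n r(n)^2 (1 - a r(n)^2)\Big) = \frac{4n}{m^{1-\gamma_r}} \Big(a^2 n r(n)^2 + a (1 - a r(n)^2)\Big) \le \frac{c\, n}{m^{1-\gamma_r}},$$

where $c$ is some constant.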
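The closed form for $E[K^2]$ used above can be checked against a direct summation over the binomial distribution. This is a small sketch for verification only; the function names and the test values of $n$ and $p$ are assumptions, with $p$ standing in for $a r(n)^2$.

```python
from math import comb


def binomial_second_moment_exact(n, p):
    """E[K^2] for K ~ B(n, p), computed by summing over the pmf."""
    return sum(
        k * k * comb(n, k) * p ** k * (1 - p) ** (n - k)
        for k in range(n + 1)
    )


def binomial_second_moment_closed(n, p):
    """Closed form (np)^2 + np(1 - p), with p playing the role of a*r(n)^2."""
    mean = n * p
    return mean ** 2 + mean * (1 - p)


print(binomial_second_moment_exact(50, 0.1))
print(binomial_second_moment_closed(50, 0.1))
```

Both values agree up to floating-point error, which is the identity $E[K^2] = \mathrm{Var}[K] + E[K]^2$.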



E. Second region 

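Next, we consider the region where $r(n) = \Omega(\sqrt{1/n})$ and $r(n) = O(\sqrt{\log(m)/n})$. Equation (19a) implies:

$$E[L] \le \frac{2}{r(n)^2} \sum_{0 \le k \le k_0} \Big[1 - \Big(1 - \sum_{j=1}^{k} f_j\Big)^{k}\Big] \Pr[K = k] + \frac{2}{r(n)^2} \sum_{k > k_0} \Big[1 - \Big(1 - \sum_{j=1}^{k} f_j\Big)^{k}\Big] \Pr[K = k]. \tag{21}$$

Assuming that $r(n) \le \sqrt{c \log(m)/n}$, we choose $k_0 = 6 a c \log(m)$, where $c$ is some constant. Note that $\big[1 - (1 - \sum_{j=1}^{k} f_j)^{k}\big]$ is an increasing function of $k$ and is at most 1. Therefore, (21) implies

$$E[L] \le \frac{2}{r(n)^2} \Big[1 - \Big(1 - \sum_{j=1}^{k_0} f_j\Big)^{k_0}\Big] \Pr[K \le k_0] + \frac{2}{r(n)^2} \Pr[K > k_0] \tag{22}$$

$$\le \frac{2}{r(n)^2} k_0 \sum_{j=1}^{k_0} f_j + \frac{2}{r(n)^2} \Pr[K > k_0] \tag{23}$$

$$\le \frac{4}{r(n)^2} \frac{k_0^{2-\gamma_r}}{m^{1-\gamma_r}} + \frac{2}{r(n)^2} \Pr[K > k_0]. \tag{24}$$

We use part (iii) of lemma 1 to derive the last inequality. For the binomial random variable $K$ and for any $R \ge 6 E[K]$, the Chernoff bound holds [18]:

$$\Pr[K \ge R] \le 2^{-R}. \tag{25}$$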
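The Chernoff bound in (25) can be compared with the exact binomial tail for small instances. The sketch below is illustrative only: the function names and the values $n = 200$, $p = 0.01$ are assumptions, and the check is meaningful only for thresholds $R \ge 6 E[K] = 6np$.

```python
from math import comb


def binomial_upper_tail(n, p, r):
    """Exact Pr[K >= r] for K ~ B(n, p)."""
    return sum(
        comb(n, k) * p ** k * (1 - p) ** (n - k)
        for k in range(r, n + 1)
    )


def chernoff_holds(n, p, r):
    """Check Pr[K >= r] <= 2^(-r); intended for r >= 6 * n * p."""
    return binomial_upper_tail(n, p, r) <= 2.0 ** (-r)


n, p = 200, 0.01          # E[K] = 2, so the bound applies for r >= 12
for r in (12, 16, 20):
    print(r, binomial_upper_tail(n, p, r), 2.0 ** (-r), chernoff_holds(n, p, r))
```

In the proof, the threshold $k_0 = 6ac\log(m)$ is chosen precisely so that it is at least $6E[K] = 6 a n r(n)^2$ throughout the second region.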



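Applying the Chernoff bound and substituting $k_0$ in (24), we obtain:

$$E[L] \le \frac{4}{r(n)^2} \frac{(6 a c \log(m))^{2-\gamma_r}}{m^{1-\gamma_r}} + \frac{2}{r(n)^2}\, 2^{-6 a c \log(m)} = \frac{4 (6ac)^{2-\gamma_r}}{r(n)^2} \frac{(\log(m))^{2-\gamma_r}}{m^{1-\gamma_r}} + \frac{2}{r(n)^2\, m^{6 a c \log 2}}. \tag{26}$$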



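The function $f(x) = \log(x)/x^{\beta}$ is always less than $1/\beta$, where $\beta > 0$. Thus, $\log(m) < \frac{1}{\beta} m^{\beta}$, and substituting into (26) gives

$$E[L] \le \frac{4 (6ac)^{2-\gamma_r}}{\beta^{2-\gamma_r}} \frac{1}{r(n)^2\, m^{1-\gamma_r-\beta(2-\gamma_r)}} + \frac{2}{r(n)^2\, m^{6 a c \log 2}}. \tag{27}$$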



F. Third and fourth regions 

For the third region, $r(n) = \Omega(\sqrt{\log(m)/n})$ and $r(n) = O(\sqrt{m^{\eta}/n})$. To show the upper bound for $E[L]$ in this region, we follow a procedure similar to that of the second region, setting $k_0 = 6 a n r(n)^2$. For the last region, $r(n) = \Omega(\sqrt{m^{\eta}/n})$, the total number of virtual clusters is $O(n/m^{\eta})$. Thus, for this range of $r(n)$, $E[L] = O(n/m^{\eta})$.



In the following, we show the second part of the theorem. As in the proof of theorem 1, we relate the number of good clusters to the number of active links. We restrict users to communicate only with neighbors in their own cluster. We further forbid users from getting certain files from their neighbors, even though some neighbors might store these files. In this case, the value of a cluster is the sum of the probabilities of the stored files that users can get via their neighbors. Because of the restriction on files, the value of a cluster becomes concentrated around its mean. We also account for self requests in finding the lower bound. Applying the Chernoff bound and the Azuma inequality, we show that the probability of goodness does not vanish when a user is surrounded on average by $\pi n r_{\mathrm{opt}}(n)^2$ neighbors, from which the result follows.

Define $\eta_1 = \eta + \epsilon = \frac{(1-\gamma_r)\gamma_c}{1-\gamma_r+\gamma_c}$. We should show that if we choose $r(n) = \Theta(\sqrt{m^{\eta_1}/n})$, the probability that a virtual cluster is good does not vanish as $n$ grows.

When $r(n) = \Theta(\sqrt{m^{\eta_1}/n})$, there are $\Theta(n/m^{\eta_1})$ virtual clusters. The number of active D2D links is upper bounded by the number of virtual clusters. Thus, $E[L] = O(n/m^{\eta_1})$. Then, we show that for $c_3 \sqrt{m^{\eta_1}/n} \le r(n) \le c_4 \sqrt{m^{\eta_1}/n}$, $E[L] = \Omega(n/m^{\eta_1})$. To do this, we follow a procedure similar to that of theorem 1. We
divide the cell into virtual clusters and we allow each user to 
look for its desired file just within its cluster. As mentioned 
before, each cluster can block at most 16 other clusters (Figure 

M i. 

G. Limiting users and excluding self request 

To find the lower bound, we restrict users even further. We assume that users cannot get files $\{1, 2, \ldots, q-1\}$ locally even if there are users in the cluster that cache these files, where $q = m^{\eta_1/\gamma_c}$. So, caching files $\{1, 2, \ldots, q-1\}$ does not have any value for any user in the cluster.

E[L] is lower bounded by expression in (j2j) where the lower 
bound for E[G] is given in ffih. Similar to d6]l, we exclude 



the self requests. Thus, the probability that a cluster is good conditioned on $k$ and $\omega$ is

$$\Pr[\text{good} \mid k, \omega] \ge 1 - \Big(1 - \Big(v(\omega) - \max_{j \ge q} f_j \mathbb{1}_j\Big)\Big)^{k}, \tag{28}$$

where $v(\omega) = \sum_{j=q}^{m} f_j \mathbb{1}_j$ and $\mathbb{1}_j$ is an indicator function that equals one if at least one user in the virtual cluster stores file $j$. We limit ourselves to all cases in which at least one user caches file $q$. Hence,

$$\Pr[\text{good} \mid k, \omega] \ge 1 - \big(1 - (v(\omega) - f_q)\big)^{k}. \tag{29}$$



H. Chernoff bound 

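As in the proof of theorem 1, we first limit the interval of $k$ and then use the Chernoff bound.

By restricting $k$ to an interval around its average, i.e., $I = [n r(n)^2 (1-\delta)/2,\; n r(n)^2 (1+\delta)/2]$ where $0 < \delta < 1$, and applying (29) in the lower bound for $E[G]$, the following lower bound is obtained:

$$E[G] \ge \frac{2}{r(n)^2} \sum_{k \in I} \Pr[K = k] \times \sum_{\omega \in x} \big[1 - (1 - (v(\omega) - f_q))^{k}\big] \Pr[\omega], \tag{30}$$

where $x = \{\omega \mid |\omega| = k \text{ and } q \in \omega\}$. Let $k^*$ be

$$k^* \triangleq \arg\min_{k \in I} \sum_{\omega \in x} \big[1 - (1 - (v(\omega) - f_q))^{k}\big] \Pr[\omega].$$

Notice that $k^*$ and also all $k \in I$ are $\Theta(m^{\eta_1})$. Then,

$$E[G] \ge \frac{2}{r(n)^2} \Pr[K \in I] \times \sum_{\omega \in x} \big[1 - (1 - (v(\omega) - f_q))^{k^*}\big] \Pr[\omega]$$

$$\ge \frac{2}{r(n)^2} \big(1 - 2\exp(-n r(n)^2 \delta^2/6)\big) \sum_{\omega \in x} \big[1 - (1 - (v(\omega) - f_q))^{k^*}\big] \Pr[\omega]. \tag{31}$$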



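We use the Chernoff bound in (31). Let $A^{k^*}_{q,h}$ denote the event that $1 \le h \le k^*$ users cache file $q$. Then, we can rewrite the above lower bound as follows:

$$E[G] \ge \frac{2}{r(n)^2} \big(1 - 2\exp(-n r(n)^2 \delta^2/6)\big) \sum_{h=1}^{k^*} E_v\big[1 - (1 - (v - f_q))^{k^*} \mid A^{k^*}_{q,h}\big] \Pr[A^{k^*}_{q,h}]$$

$$\ge \frac{2}{r(n)^2} \big(1 - 2\exp(-n r(n)^2 \delta^2/6)\big) \times \sum_{h = k^* p_q (1-\delta_1)}^{k^* p_q (1+\delta_1)} E_v\big[1 - (1 - (v - f_q))^{k^*} \mid A^{k^*}_{q,h}\big] \Pr[A^{k^*}_{q,h}], \tag{32}$$

where $\Pr[A^{k^*}_{q,h}] = \binom{k^*}{h} (p_q)^{h} (1 - p_q)^{k^* - h}$, $k^* p_q$ is the average of the binomial random variable $h$, and $0 < \delta_1 < 1$.

Define $h^*$ as

$$h^* = \arg\min_{k^* p_q (1-\delta_1) \le h \le k^* p_q (1+\delta_1)} E_v\big[1 - (1 - (v - f_q))^{k^*} \mid A^{k^*}_{q,h}\big]. \tag{33}$$

Using the Chernoff bound for the binomial random variable $h$, we obtain:

$$E[G] \ge \frac{2}{r(n)^2} \big(1 - 2\exp(-n r(n)^2 \delta^2/6)\big) \times \big(1 - 2\exp(-k^* p_q \delta_1^2/3)\big)\, E_v\big[1 - (1 - (v - f_q))^{k^*} \mid A^{k^*}_{q,h^*}\big]. \tag{34}$$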

The probability that a user caches file $q$ is:

$$p_q = \frac{1/q^{\gamma_c}}{\sum_{j=1}^{m} 1/j^{\gamma_c}} = \frac{1}{q^{\gamma_c} H(\gamma_c, 1, m)}, \tag{35}$$

where the function $H$ is defined in lemma 1. We show in lemma 1 that $H(\gamma_c, 1, m) = \Theta(1)$ given that $\gamma_c > 1$. Thus, all $h \in [k^* p_q (1 - \delta_1),\; k^* p_q (1 + \delta_1)]$ are $\Theta(1)$. By selecting the constant $c_3$ large enough, the second exponential term $\big(1 - 2\exp(-k^* p_q \delta_1^2/3)\big)$ will be greater than zero.

I. Probability of goodness is not vanishing

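To complete the proof, it is enough to show that the probability that a cluster is good, i.e., $E_v\big[1 - (1 - (v - f_q))^{k^*} \mid A^{k^*}_{q,h^*}\big]$ given in (34), does not vanish.

$$E_v\big[1 - (1 - (v - f_q))^{k^*} \mid A^{k^*}_{q,h^*}\big] \ge \int_{E_v[v \mid A^{k^*}_{q,h^*}] - t}^{E_v[v \mid A^{k^*}_{q,h^*}] + t} \big(1 - (1 - (v - f_q))^{k^*}\big)\, f_{v \mid A^{k^*}_{q,h^*}}(v)\, dv, \tag{36}$$

where $f_{v \mid A^{k^*}_{q,h^*}}(v)$ is the probability density function of the value $v$ conditioned on $A^{k^*}_{q,h^*}$, and $0 < t < E_v[v \mid A^{k^*}_{q,h^*}]$. The average of $v$ conditioned on $A^{k^*}_{q,h^*}$ is given by:

$$E_v[v \mid A^{k^*}_{q,h^*}] = f_q + \sum_{j=q+1}^{m} f_j \big(1 - (1 - p_j)^{k^* - h^*}\big). \tag{37}$$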



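From equation (36) and since $\big(1 - (1 - (v - f_q))^{k^*}\big)$ is an increasing function of $v$,

$$E_v\big[1 - (1 - (v - f_q))^{k^*} \mid A^{k^*}_{q,h^*}\big] \ge \Big(1 - \big(1 - ((E_v[v \mid A^{k^*}_{q,h^*}] - t) - f_q)\big)^{k^*}\Big) \Pr\big[|v - E_v[v \mid A^{k^*}_{q,h^*}]| < t\big] \tag{38a}$$

$$\ge \Big(1 - \exp\big(-k^* (E_v[v \mid A^{k^*}_{q,h^*}] - t - f_q)\big)\Big) \Pr\big[|v - E_v[v \mid A^{k^*}_{q,h^*}]| < t\big]. \tag{38b}$$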



We show in lemma 2 that $E_v[v \mid A^{k^*}_{q,h^*}] = \Theta(1/m^{\eta_1})$. Thus, $k^* E_v[v \mid A^{k^*}_{q,h^*}] = \Theta(1)$. Furthermore, (36) implies

$$t = O\big(E_v[v \mid A^{k^*}_{q,h^*}]\big) = O\Big(\frac{1}{m^{\eta_1}}\Big).$$

Similar to (35), we can show that $f_q = \Theta(1/m^{\eta_2}) = O\big(E_v[v \mid A^{k^*}_{q,h^*}]\big)$, where $\eta_2 = \frac{(1-\gamma_r)(1+\gamma_c)}{1-\gamma_r+\gamma_c}$. Thus, the exponent in the first term of (38b) is $\Theta(1)$. To prove the result, it is enough to show that the second term in (38b) does not approach zero as $n$ grows. By applying the Azuma-Hoeffding inequality in lemma 3,

$$\Pr\big[|v - E_v[v \mid A^{k^*}_{q,h^*}]| < t\big] \ge 1 - 2\exp\Big(-\frac{2t^2}{(k^* - h^*)(f_q)^2}\Big). \tag{39}$$

Due to the fact that $k^* = \Theta(m^{\eta_1})$ and $h^* = \Theta(1)$, the term $k^* - h^* = \Theta(m^{\eta_1})$. If we select $t = \Theta(1/m^{\eta_1})$, we can observe that the exponent $\frac{2t^2}{(k^* - h^*)(f_q)^2}$ scales with $m^{\eta_1 (2/\gamma_c - 1)}$. Hence, if $\gamma_c < 2$, the exponent goes to infinity as $n$ grows. $\gamma_c < 2$ implies that $\epsilon < \frac{1}{6}$. This means that $v$ is concentrated around its average with high probability if $\epsilon < \frac{1}{6}$ and, as a result, the second term in (38b) is a positive constant when $n$ goes to infinity.
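The exponent bookkeeping in this part of the proof can be verified with exact rational arithmetic. The sketch below is an illustration under stated assumptions: $\eta_1$ and $\eta_2$ follow the expressions above, while $\eta = (1-\gamma_r)/(2-\gamma_r)$ is assumed here as the upper-bound exponent of theorem 2 (it equals $\eta_1$ at $\gamma_c = 1$); the function names are hypothetical.

```python
from fractions import Fraction


def eta(gamma_r):
    """Assumed upper-bound exponent: (1 - g_r) / (2 - g_r)."""
    return (1 - gamma_r) / (2 - gamma_r)


def eta1(gamma_r, gamma_c):
    """Exponent of the cluster size: g_c (1 - g_r) / (1 - g_r + g_c)."""
    return gamma_c * (1 - gamma_r) / (1 - gamma_r + gamma_c)


def eta2(gamma_r, gamma_c):
    """Exponent of f_q: (1 - g_r)(1 + g_c) / (1 - g_r + g_c)."""
    return (1 - gamma_r) * (1 + gamma_c) / (1 - gamma_r + gamma_c)


def azuma_exponent_growth(gamma_r, gamma_c):
    """Power of m in t^2 / ((k* - h*) f_q^2) with t ~ m^-eta1, k* ~ m^eta1."""
    return 2 * eta2(gamma_r, gamma_c) - 3 * eta1(gamma_r, gamma_c)


g_r = Fraction(1, 2)
for g_c in (Fraction(3, 2), Fraction(2), Fraction(3)):
    print(g_c, azuma_exponent_growth(g_r, g_c))
```

The growth exponent is positive exactly when $\gamma_c < 2$, and the gap $\epsilon = \eta_1 - \eta$ reaches $1/6$ only in the limiting case $\gamma_r = 0$, $\gamma_c = 2$.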

Appendix C 
Proof of Theorem 3 

The proof of the first part of the theorem is similar to the proof of theorem 2. $E[L]$ is upper bounded by the expression in (19a). Next, we consider three non-overlapping regions for $r(n)$ and show that the upper bound is valid for every $r(n)$.

A. First region 

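First, we assume $r(n) = O\big(\sqrt{\log\log(m)/n}\big)$. Equation (19b) and part (v) of lemma 1 imply,

$$E[L] \le \frac{2}{r(n)^2} \sum_{k=1}^{n} k\, \frac{\log(k) + 1}{\log(m)} \Pr[K = k] \tag{40a}$$

$$\le \frac{2}{r(n)^2 \log(m)} \sum_{k=1}^{n} k^2 \Pr[K = k] \tag{40b}$$

$$= \frac{2}{r(n)^2 \log(m)} E[K^2] \tag{40c}$$

$$\le \frac{2}{r(n)^2 \log(m)} \big[(a n r(n)^2)^2 + a n r(n)^2\big] \tag{40d}$$

$$= \frac{2n}{\log(m)} \big[a^2 n r(n)^2 + a\big] \tag{40e}$$

$$\le \frac{2cn}{\log(m)} \big(a^2 \log\log(m) + a\big) \tag{40f}$$

$$= \Theta\Big(\frac{n \log\log(m)}{\log(m)}\Big). \tag{40g}$$

To derive (40f), we use the range of $r(n)$.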



B. Second and third regions 

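Let us consider the second region for $r(n)$. In this region, $r(n) = \Omega\big(\sqrt{\log\log(m)/n}\big)$ and $r(n) = O\big(\sqrt{\log(m)/n}\big)$. From (19a),

$$E[L] \le \frac{2}{r(n)^2} \sum_{k=0}^{6 a n r(n)^2} \Big[1 - \Big(1 - \sum_{j=1}^{k} f_j\Big)^{k}\Big] \Pr[K = k] + \frac{2}{r(n)^2} \sum_{k > 6 a n r(n)^2} \Big[1 - \Big(1 - \sum_{j=1}^{k} f_j\Big)^{k}\Big] \Pr[K = k], \tag{41}$$

where $a$ is defined in theorem 2. The term $\big[1 - (1 - \sum_{j=1}^{k} f_j)^{k}\big]$ is an increasing function of $k$, thus,

$$E[L] \le \frac{2}{r(n)^2} \Big[1 - \Big(1 - \sum_{j=1}^{6 a n r(n)^2} f_j\Big)^{6 a n r(n)^2}\Big] + \frac{2}{r(n)^2} \Pr[K > 6 a n r(n)^2] \tag{42a}$$

$$\le \frac{2}{r(n)^2}\, 6 a n r(n)^2 \sum_{j=1}^{6 a n r(n)^2} f_j + \frac{2}{r(n)^2}\, 2^{-6 a n r(n)^2}. \tag{42b}$$

In (42b), we applied the Chernoff bound [18]. From lemma 1 and the range of $r(n)$, we obtain

$$E[L] \le 12 a n\, \frac{\log(6 a n r(n)^2) + 1}{\log(m)} + \frac{2}{r(n)^2}\, 2^{-6 a n r(n)^2}$$

$$\le 12 a n\, \frac{\log(6 a c_7 \log(m)) + 1}{\log(m)} + \frac{2n}{c_8 \log\log(m)}\, 2^{-6 a c_8 \log\log(m)}$$

$$= \Theta\Big(\frac{n \log\log(m)}{\log(m)}\Big) + \frac{2n}{c_8 \log\log(m)\, \log(m)^{6 a c_8 \log(2)}} = \Theta\Big(\frac{n \log\log(m)}{\log(m)}\Big).$$

For the last region, i.e., $r(n) = \Omega\big(\sqrt{\log(m)/n}\big)$, the total number of virtual clusters is $O\big(n/\log(m)\big)$ and, as a result, $E[L] = O\big(n/\log(m)\big)$.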



In the following, we show the second part of the theorem. We propose a centralized algorithm that can match the upper bound. The BS divides the cell into virtual clusters of size $r(n) = \Theta\Big(\sqrt{\frac{\log(m)}{n \log\log(m)}}\Big)$. Given that there are $k$ users in a cluster, each of them should cache one of the $k$ most popular files. We show that under this caching policy, we can match the upper bound. To find the lower bound, we assume that users can find their desired files only within the clusters they belong to. The lower bounds for $E[L]$ and $E[G]$ are respectively given
in |2]i and Limiting the range of k results in 



$$E[G] \ge \frac{2}{r(n)^2} \sum_{k \in I} \Pr[\text{good} \mid k] \Pr[K = k], \tag{43}$$

where $I = [n r(n)^2 (1-\delta)/2,\; n r(n)^2 (1+\delta)/2]$. Under this centralized caching policy, the value of the files stored within a cluster with $k$ users is $v(k) = \sum_{j=1}^{k} f_j$. The cluster is good if at least one user within the cluster requests one of the $k$ most popular files that is not stored in its own cache:

$$\Pr[\text{good} \mid k] \ge 1 - \big(1 - (v(k) - f_1)\big)^{k} \ge 1 - \exp\Big(-k \sum_{j=2}^{k} f_j\Big) \ge 1 - \exp\Big(-\frac{k (\log(k) - 1)}{\log(m) + 1}\Big). \tag{44}$$

In the last inequality we used lemma 1. The expression in (44) is an increasing function of $k$. Thus, (44) and (43) imply

$$E[G] \ge \frac{2}{r(n)^2} \Big(1 - \exp\Big(-\frac{k_{\min} (\log(k_{\min}) - 1)}{\log(m) + 1}\Big)\Big) \Pr[K \in I] \tag{45}$$

$$\ge \frac{2}{r(n)^2} \Big(1 - \exp\Big(-\frac{k_{\min} (\log(k_{\min}) - 1)}{\log(m) + 1}\Big)\Big) \times \big(1 - 2\exp(-n r(n)^2 \delta^2/6)\big), \tag{46}$$

where $k_{\min} = n r(n)^2 (1-\delta)/2 = \Theta\Big(\frac{\log(m)}{\log\log(m)}\Big)$. We use the Chernoff bound to derive (46). As $n$ grows, the second term in (46) goes to 1. It can be seen that the first term in (46) is also $\Theta(1)$. Thus, $E[G]$ and consequently $E[L]$ are $\Theta\Big(\frac{n \log\log(m)}{\log(m)}\Big)$.
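The chain of inequalities in (44) can be checked numerically for the case $\gamma_r = 1$. The following sketch assumes a Zipf request distribution with exponent 1 and a cluster whose $k$ users cache the $k$ most popular files; the function names and the library size $m = 10000$ are illustrative assumptions.

```python
from math import exp, log


def zipf_pmf(m, gamma):
    """Request probabilities f_1..f_m of a Zipf law with exponent gamma."""
    weights = [1.0 / j ** gamma for j in range(1, m + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def good_exact_lower(k, f):
    """First expression in (44): 1 - (1 - sum_{j=2}^k f_j)^k."""
    mass_without_top_file = sum(f[1:k])
    return 1.0 - (1.0 - mass_without_top_file) ** k


def good_closed_lower(k, m):
    """Last expression in (44): 1 - exp(-k (log k - 1) / (log m + 1))."""
    return 1.0 - exp(-k * (log(k) - 1.0) / (log(m) + 1.0))


m = 10000                      # illustrative library size (assumption)
f = zipf_pmf(m, 1.0)
for k in (5, 20, 100):
    print(k, good_exact_lower(k, f), good_closed_lower(k, m))
```

For very small $k$ the closed-form expression is negative and therefore vacuous; it becomes informative once $\log(k) > 1$.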

Appendix D 
Some preliminary lemmas 

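Lemma 1: i) If $\gamma > 1$ and $a = o(b)$, $H(\gamma, a, b) = \Theta\big(\frac{1}{a^{\gamma-1}}\big)$.

ii) If $\gamma < 1$, $a = o(b)$, and $a = \Theta(1)$, $H(\gamma, a, b) = \Theta(b^{1-\gamma})$.

iii) If $\gamma_r < 1$, $\sum_{j=1}^{k} f_j \le 2 \frac{k^{1-\gamma_r}}{m^{1-\gamma_r}}$.

iv) If $\gamma_c, \gamma_r > 1$, $\sum_{j=2}^{m} f_j p_j = \Theta(1)$.

v) If $\gamma_r = 1$, $\sum_{j=1}^{k} f_j \le \frac{\log(k) + 1}{\log(m)}$ and $\sum_{j=2}^{k} f_j \ge \frac{\log(k) - 1}{\log(m) + 1}$,

where

$$H(\gamma, a, b) = \sum_{j=a}^{b} \frac{1}{j^{\gamma}} \tag{47}$$

and $f_j$ is defined in (1).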

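Proof: We first prove parts (i) and (ii) of the lemma. $\frac{1}{x^{\gamma}}$ is monotonically decreasing. Thus,

$$H(\gamma, a, b) \ge \frac{b^{-\gamma+1} - a^{-\gamma+1}}{-\gamma + 1}. \tag{48}$$

We also have the following inequality:

$$H(\gamma, a, b) - \frac{1}{a^{\gamma}} = \sum_{j=a+1}^{b} \frac{1}{j^{\gamma}} \le \frac{b^{-\gamma+1} - a^{-\gamma+1}}{-\gamma + 1}. \tag{49}$$

Thus, $H(\gamma, a, b)$ satisfies

$$\frac{b^{-\gamma+1} - a^{-\gamma+1}}{-\gamma + 1} \le H(\gamma, a, b) \le \frac{b^{-\gamma+1} - a^{-\gamma+1}}{-\gamma + 1} + \frac{1}{a^{\gamma}}. \tag{50}$$

Therefore, if $\gamma > 1$, $H(\gamma, a, b) = \Theta\big(\frac{1}{a^{\gamma-1}}\big)$. Besides, if $\gamma < 1$ and $a = \Theta(1)$, then $H(\gamma, a, b) = \Theta(b^{1-\gamma})$.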
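For part (iii), using (48) and (49), we have

$$\sum_{j=1}^{k} f_j = \frac{H(\gamma_r, 1, k)}{H(\gamma_r, 1, m)} \le \frac{k^{(1-\gamma_r)}}{m^{(1-\gamma_r)} - 1} \le 2 \frac{k^{(1-\gamma_r)}}{m^{(1-\gamma_r)}}.$$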

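Next we show part (iv). From (1), we have:

$$\sum_{j=2}^{m} f_j p_j = \frac{\sum_{j=2}^{m} \frac{1}{j^{\gamma_r + \gamma_c}}}{\sum_{j=1}^{m} \frac{1}{j^{\gamma_r}} \sum_{j=1}^{m} \frac{1}{j^{\gamma_c}}} = \frac{H(\gamma_c + \gamma_r, 2, m)}{H(\gamma_c, 1, m)\, H(\gamma_r, 1, m)}. \tag{51}$$

When $\gamma_c, \gamma_r > 1$, both the numerator and the denominator of $\sum_{j=2}^{m} f_j p_j$ are $\Theta(1)$, from which (iv) follows.

Since the proof of part (v) is similar to that of parts (i) and (ii), we omit it.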

■ 
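The sandwich bound (50) on $H(\gamma, a, b)$ can be checked directly for $\gamma \ne 1$. This is a minimal numerical sketch; the function names and the sample parameters are assumptions chosen for illustration.

```python
def H(gamma, a, b):
    """Generalized harmonic sum: sum_{j=a}^{b} 1 / j^gamma."""
    return sum(1.0 / j ** gamma for j in range(a, b + 1))


def integral_term(gamma, a, b):
    """(b^(1-gamma) - a^(1-gamma)) / (1 - gamma); requires gamma != 1."""
    return (b ** (1.0 - gamma) - a ** (1.0 - gamma)) / (1.0 - gamma)


def sandwich(gamma, a, b):
    """Return (lower bound, H, upper bound) as in (50)."""
    lower = integral_term(gamma, a, b)
    upper = lower + 1.0 / a ** gamma
    return lower, H(gamma, a, b), upper


for gamma in (0.5, 1.5, 2.0):
    print(gamma, sandwich(gamma, 1, 1000))
```

For $\gamma > 1$ all three values stay bounded as $b$ grows, while for $\gamma < 1$ they grow like $b^{1-\gamma}$, matching parts (i) and (ii).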

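Lemma 2: If $\gamma_c > 1$, $\gamma_r < 1$, $k = \Theta(m^{\eta_1})$, and $h = \Theta(1)$, then

$$E_v[v \mid A^{k}_{q,h}] = \Theta\Big(\frac{1}{m^{\eta_1}}\Big),$$

where $\eta_1 = \frac{\gamma_c (1-\gamma_r)}{1-\gamma_r+\gamma_c}$, $q = m^{\eta_1/\gamma_c}$, and $E_v[v \mid A^{k}_{q,h}]$ is defined in (37).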



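Proof: For the lower bound, we have:

$$E_v[v \mid A^{k}_{q,h}] = f_q + \sum_{j=q+1}^{m} f_j \big(1 - (1 - p_j)^{k-h}\big) \ge \sum_{j=q}^{m} f_j \big(1 - e^{-k' p_j}\big), \tag{52}$$

where $k' = k - h = \Theta(m^{\eta_1})$. Using the Taylor series, we obtain:

$$E_v[v \mid A^{k}_{q,h}] \ge \sum_{j=q}^{m} \Big[f_j k' p_j - \frac{1}{2!} f_j (k' p_j)^2 + \frac{1}{3!} f_j (k' p_j)^3 - \ldots\Big]$$

$$= k' \frac{H(\gamma_c + \gamma_r, q, m)}{H(\gamma_r, 1, m)\, H(\gamma_c, 1, m)} - \frac{1}{2!} k'^2 \frac{H(2\gamma_c + \gamma_r, q, m)}{H(\gamma_r, 1, m)\, H(\gamma_c, 1, m)^2} + \frac{1}{3!} k'^3 \frac{H(3\gamma_c + \gamma_r, q, m)}{H(\gamma_r, 1, m)\, H(\gamma_c, 1, m)^3} - \ldots \tag{53}$$

Parts (i) and (ii) of lemma 1 imply that all terms in the above equation are $\Theta\big(\frac{1}{m^{\eta_1}}\big)$.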



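For the upper bound,

$$E_v[v \mid A^{k}_{q,h}] = f_q + \sum_{j=q+1}^{m} f_j \big(1 - (1 - p_j)^{k'}\big) \le f_q + k' \sum_{j=q+1}^{m} f_j p_j \tag{54}$$

$$\le \frac{1}{q^{\gamma_r} H(\gamma_r, 1, m)} + k' \frac{H(\gamma_c + \gamma_r, q, m)}{H(\gamma_r, 1, m)\, H(\gamma_c, 1, m)}. \tag{55}$$

If we apply the results of lemma 1, we can show that $E_v[v \mid A^{k}_{q,h}]$ is $O\big(\frac{1}{m^{\eta_1}}\big)$.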
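The two bounds used in the proof of lemma 2 can be compared with the exact conditional mean (37) on a small instance. The sketch below assumes Zipf request and caching distributions; all names and parameter values ($m = 2000$, $\gamma_r = 0.5$, $\gamma_c = 1.5$, $q = 10$, $k' = 40$) are illustrative assumptions rather than values from the paper.

```python
from math import exp


def zipf_pmf(m, gamma):
    """Zipf probabilities over files 1..m with exponent gamma."""
    weights = [1.0 / j ** gamma for j in range(1, m + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def expected_value(f, p, q, k_prime):
    """(37): f_q + sum_{j=q+1}^m f_j (1 - (1 - p_j)^k'); files are 1-indexed."""
    tail = sum(
        f[j - 1] * (1.0 - (1.0 - p[j - 1]) ** k_prime)
        for j in range(q + 1, len(f) + 1)
    )
    return f[q - 1] + tail


def lower_bound(f, p, q, k_prime):
    """(52): sum_{j=q}^m f_j (1 - exp(-k' p_j))."""
    return sum(
        f[j - 1] * (1.0 - exp(-k_prime * p[j - 1]))
        for j in range(q, len(f) + 1)
    )


def upper_bound(f, p, q, k_prime):
    """(54): f_q + k' sum_{j=q+1}^m f_j p_j."""
    return f[q - 1] + k_prime * sum(
        f[j - 1] * p[j - 1] for j in range(q + 1, len(f) + 1)
    )


f = zipf_pmf(2000, 0.5)   # request distribution, gamma_r = 0.5
p = zipf_pmf(2000, 1.5)   # caching distribution, gamma_c = 1.5
print(lower_bound(f, p, 10, 40), expected_value(f, p, 10, 40), upper_bound(f, p, 10, 40))
```

The exact value lies between the two bounds because $(1 - p)^{k'} \le e^{-k' p}$ and $1 - (1 - p)^{k'} \le k' p$ for $0 \le p \le 1$.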



Lemma 3: For $t < E_v[v \mid A^{k}_{q,h}]$,

$$\Pr\big[|v - E_v[v \mid A^{k}_{q,h}]| < t\big] \ge 1 - 2\exp\Big(-\frac{2t^2}{(k-h)(f_q)^2}\Big). \tag{56}$$



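Proof: The function $v : \{1, 2, \ldots, m\}^{k} \to \mathbb{R}$ is equal to

$$v(\omega_1, \omega_2, \ldots, \omega_k) = \sum_{j \in \omega \cap Q} f_j,$$

where $\omega_j$ is the file that user $j$ stores, $\omega = \bigcup_{j=1}^{k} \omega_j$ and $Q = \{q, q+1, \ldots, m\}$. $v$ is the sum of the popularities of the union of files stored by users when only files in the set $Q$ are considered to be valuable. By replacing the $i$th coordinate $\omega_i$ by some other value $\hat{\omega}_i$, the value of $v$ can change by at most $f_q$, i.e.,

$$\sup |v(\omega_1, \ldots, \omega_k) - v(\omega_1, \ldots, \hat{\omega}_i, \ldots, \omega_k)| \le f_q.$$

Using the Azuma-Hoeffding inequality [20],

$$\Pr\big[|v - E_v[v \mid A^{k}_{q,h}]| \ge t\big] \le 2\exp\Big(-\frac{2t^2}{(k-h)(f_q)^2}\Big). \tag{57}$$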
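The bounded-difference property behind lemma 3 can be confirmed by exhaustive enumeration on a tiny instance. The sketch below is an illustration only: it assumes Zipf popularities and uses hypothetical names, with $m = 5$ files, $q = 2$ and $k = 3$ users so that every cache configuration can be enumerated.

```python
from itertools import product


def zipf_pmf(m, gamma):
    """Zipf probabilities over files 1..m with exponent gamma."""
    weights = [1.0 / j ** gamma for j in range(1, m + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def cluster_value(cached, f, q):
    """Sum of f_j over the distinct cached files j >= q (1-indexed files)."""
    return sum(f[j - 1] for j in set(cached) if j >= q)


def max_single_change(f, q, k):
    """Largest change of the cluster value when one user's file is replaced."""
    m = len(f)
    worst = 0.0
    for cached in product(range(1, m + 1), repeat=k):
        base = cluster_value(cached, f, q)
        for i in range(k):
            for new_file in range(1, m + 1):
                changed = cached[:i] + (new_file,) + cached[i + 1:]
                worst = max(worst, abs(cluster_value(changed, f, q) - base))
    return worst


f = zipf_pmf(5, 0.8)
print(max_single_change(f, q=2, k=3), f[1])  # second value is f_q for q = 2
```

Changing one cached file can remove at most one valuable file and add at most one, and since popularities decrease in $j$, the largest possible change is $f_q$.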
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